Using key phrases as new queries in building relevance judgments automatically

Authors

  • Mireille Makary
  • Michael P. Oakes
  • Fadi Yamout
Abstract

We describe a new technique for building a relevance judgment list (qrels) for TREC test collections with no human intervention. For each TREC topic, a set of new queries is automatically generated from key phrases extracted from the top k documents retrieved by 12 different Terrier weighting models when the initial TREC topic is submitted. We assign a score to each key phrase based on its similarity to the original TREC topic. The key phrases with the highest scores become the new queries for a second search, this time using the Terrier BM25 weighting model. The union of the documents retrieved forms the automatically built set of qrels.
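The scoring and union steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which similarity measure is used, so token-level Jaccard overlap stands in here as an assumption. The names `jaccard`, `top_key_phrases`, and `build_qrels` are hypothetical, and the `search` argument is a placeholder for the second-pass retrieval run (Terrier BM25 in the paper), which this sketch does not call.

```python
from __future__ import annotations

from typing import Callable, Iterable


def tokenize(text: str) -> set[str]:
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an ASSUMED stand-in for the paper's measure."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def top_key_phrases(topic: str, phrases: Iterable[str], n: int) -> list[str]:
    """Rank candidate key phrases by similarity to the topic and keep the best n."""
    unique = sorted(set(phrases))  # alphabetical order makes tie-breaking deterministic
    return sorted(unique, key=lambda p: jaccard(topic, p), reverse=True)[:n]


def build_qrels(
    topic: str,
    phrases: Iterable[str],
    n: int,
    search: Callable[[str], Iterable[str]],
) -> set[str]:
    """Union of the document ids returned for each selected key phrase.

    `search` is a hypothetical hook: any function mapping a query string
    to document ids (for example, a wrapper around a BM25 run).
    """
    qrels: set[str] = set()
    for query in top_key_phrases(topic, phrases, n):
        qrels.update(search(query))
    return qrels


if __name__ == "__main__":
    # Toy index standing in for a real retrieval system.
    fake_index = {
        "relevance judgments": ["d1", "d2"],
        "automatic relevance feedback": ["d2", "d3"],
        "query expansion": ["d9"],
    }
    result = build_qrels(
        "automatic relevance judgments",
        fake_index.keys(),
        2,
        lambda q: fake_index.get(q, []),
    )
    print(sorted(result))
```

In this sketch the candidate phrases are passed in directly. In the described technique they would first be extracted from the top k documents of the initial runs, a step that is omitted here because the abstract gives no detail on the extraction method.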


Similar resources

Automatically Building a Repository to Support Evidence Based Practice

Our long-term goal is to find, store, update, and provide access to key facts needed to support clinical decision making. Presently, the facts are extracted automatically from clinical narrative and biomedical literature sources, primarily MEDLINE, and stored in the Repository for Informed Decision Making. We envision expert community validation of the extracted facts and peer-reviewed direct d...


CLEF 2003 Experiments at UB: Automatically Generated Phrases and Relevance Feedback for Improving CLIR

This paper presents the results obtained by the University at Buffalo (UB) in CLEF 2003. Our efforts concentrated on the monolingual retrieval and large multilingual retrieval tasks. We used a modified version of the SMART system, a heuristic method based on bigrams to generate phrases that works across multiple languages, and pseudo relevance feedback. Query translation was performed using pub...


Selecting a Subset of Queries for Acquisition of Further Relevance Judgements

Assessing the relative performance of search systems requires the use of a test collection with a pre-defined set of queries and corresponding relevance assessments. The state-of-the-art process of constructing test collections involves using a large number of queries and selecting a set of documents, submitted by a group of participating systems, to be judged per query. However, the initial set...


Text Retrieval and Routing Techniques Based on an Inference Net

The TIPSTER detection project at the University of Massachusetts is focusing on information retrieval and routing techniques for large, full-text databases, including Japanese. The project approach is to use improved representations of text and information needs in the framework of a probabilistic inference net model of retrieval. In this project, retrieval (and routing) is viewed as a probabil...


An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection

OBJECTIVES: The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments used to indicate which documents are pertinent to each query when forming a biomedical test collection. METHODS: The aspect query is the major concept of this research; it can represent every aspect of the original query with the same informational need. Manually gener...




Publication date: 2016